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SIMPLIFIED FORMULAS FOR STANDARD ERRORS IN 
MAXIMUM LIKELIHOOD FACTOR ANALYSIS 

ABSTRACT 

Standard errors for tmximum likelihood estimates of factor loadings 
are expressed in terms of the inverse of an augmented information taatrix* 
This formulation arises naturally by viewing the problem as one in con- 
strained maximum likelihood estimation. The constraints correspond to the 
form of rotation used. Results are given for canonical rotation and 
analytic rotations in the orthomax family. 



SIiVJ>LIFIED TOI^MULA.S FOR STAITOARD ERRORS IN 
MAXIMUM LIKELIHOOD FACTOR ANALYSIS 



1. 



TNTRODUCTION 



Formulas for the asymptotic standard errors of unrotated loading 
estimates v/hich arise in factor analysis were given in an important paper 
by Lavley (l967)« These were extended to analytically rotated loadings by 
Archer & Jennrich (1975) for the orthogonal case and by Jennrich (1975) 
for the oblique case. The results of the latter authors applied also to 
principal components analysis* Factor analysis is one of the most popular 
statistical methodologies* Literally hundreds of factor analysis programs 
produce thousands of estimates every day, but not a single standard error. 
In part this is due to ttie fact that the standard error formulas which are 
presently available are fairly complicated in form. It turns out, however, 
that considerably simpler formulas may be obtained in the maximum likeli- 
hood case by viewing the problem as one in constrained estimation. In a 
sense this should not be a surprise. One of the advantages of maximum 
likelihood estimation is summarised it. the familiar formula 



which states that the asymptotic covariance matrix for a maximum likelihood 
estiftator of a parameter vector 9 is simply the inverse of the population 
information matrix for that parameter vector. This result holds in a 
slightly altered form when 0 is assumed to satisfy a set of functional 
constraints. The required alteration will be discussed in Section 2. 



acov 




(9) 



(1) 



In inaximuin likeli.ioed factor analysis (see, e.g., Anderson & Rubin, 
1956 or Harman, I967) estiraation is based on a sample of size n from a 
multivariate normal population of score vectors. In the case of p scores 
and k orthogonal factors the covariance matrix of this population has the 
form 

E - AA' + ilf (2) 

where A is a p by k matrix of (natural) factor loadings and \^ 

is a p by p diagonal matrix of onique variances • Since L is 
unchanged when A is replaced by AT for any orthogonal T , A is not 
determined by E and hence cannot be consistently estimated. This in- 
determinacy is eliminated by employing a variety of rotation criteria. 
These may be viev/ed as constraints so that maximum likelihood factor 
analysis is in fact constrained maximum likelihood estimation. 

While the decomposition of L given in (2) is the simplest, the de- 
composition most frequently encountered in practice involves standardized 
loadings. The standardized loadings are given by 

0^. = A. /0. , 1 < i < p, 1 < r < k (5) 

xr ^r' X ^ ^ ^ ^ \y/ 

X 

where the 7^. are the natural loadings from (2) and the 0 = (0..)^ 
jr c \ / i ^ ix' 

are the score standard deviations* In terms of the standardized loadings, 
the population correlation matrix P has a decomposition 

P = M' + r (If) 



which is of the same I'orm as that given in (2) for the population covari- 
ance matrix. Here A is a p by k matrix of standardized loadings a^^ 
and r is a p by p diagonal matrix of standardized unique variances. 
Because the diagonal of P is the p by p identity matrix I , it follows 
easily from (k) that 

P - I + ndg (5) 

where ndg AA' denotes the nondiagonal part of AA' . If X is the 
p by p diagonal matrix with diagonal elements 0^ , then 

£ = XEX 

and using (5), 

E = X(I + ndg AA')X • (7) 

This gives a parameterization of Z in tems of standardized loadings 
and score standard deviations. It will be used in Section 6 to find stand- 
ard errors for standardized loading estimates. 

2. STANDARD ERRORS FOR CONSTRAINED B'lAXIMlM LIKELIHOOD ESTIMATORS 

As observed earlier, rotation criteria impose constraints on factor 
loadings, here we consider a general result on the distribution of con- 
strained maximum likelihood estimators to be used in the following sections. 
Let 9 be a constrained maximum likelihood estimator of a parameter vector 
0 = (9^,...,©^) which is assumed to satisfy constraints 
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g^(e) - 0 ; i ^ l,...,r . (8) 

Let *:J(6) be the population information matrix and let dg/de denote 

the q by r matrix of partial derivatives dg./dO. , If 

5 i 



f-h9) iV' Ln^) 



^ d9 



(9) 



i.e.^ if ^-i*^ (9) is the q by q tratrix in the upper left-hand corner 
of the inverted augmented information matrix on the left, then 

acov 9 = J'^(e) . (10) 

This result may be found in Silvey (l971^ p. Si) • Sufficient regularity 
conditions are that *J(Q) and dg/d9 exist and are continuous in a 
neighborhood of the population value of 9 ^ thit the indicated inverse 
exists and that 9 is consistent* As expressed in (lO) the result 
here is similar in form to that in the unconstrained case. It is easy to 
show that is a pseudo-inverse of *J (9) ♦ In general it will 

not be a Moore-Penrose inverse so that in general x^^'C®) is not a 
pseudo-inverse of ^(9) • The motivation for the notation is that rj} (9) 
is obtained from the inversion of an augmented information matrix. 

The information matrix *J)(9) may^ and in our applications v;ill^ 
be singular. The constraints^ hov/ever, must be sufficient to identify 
the parameter and in particular sufficient to make the augmented 
information matrix invertible. 



5. THE II^PORMATION MATRIX FOR A NORMAL POPULATION 

The multivariate normal population density for a score vector z is 

f(z) = (2^)"^/^ izl"^ exp(4(z « pyz^iz - ^) (11) 

where |z| denotes the determinant of Z and \x denotes the mean of z . 
Since we are interested in the factor analytic structure of Z and since 
the distribution of the maximum likelihood estimate of this structure does 
not depend on \i we may assume without loss of generality that = 0 . 
With this assumption the population information for any pair of parameters 
cx and 3 is 

= E( g| log f)( ^ log f) (12) 
= 2 tr(Z ^ ) 



oZ 

(see, e.g., Jennrich, 1970 )• Here *rr denotes the p by p matrix of 

act 

partial derivatives ha. ,/ba of the components_^c^.- of Z with respect 
to an arbitrary parameter a . 

h. NATURAL LOADINGS WITH CANONICAL ROTATION 

This case is considered first because it is the simplest and may most^ 
readily be compared to previously published results. Canonical rotation 
means that the factors are orthogonal and the loadings satisfy the 
constraint: 

A'i^""A is diagonal . (l5) 
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As its name implies this is the rotation v;hich arises naturally in canoni- 
cal factor analysis (Rao, 1955)« 

Differentiating Z with respect to the natural loadings 7^^^ and 
unique variances '^^ gives 

^Z ^ At A T At ATt 

xr ir ir 

SfT' ^ii 

where J. is a p by k unit matrix with a one in row i and column r 
xr 

and zeros elsewhere. Similarly K^^ is a p by p unit matrix with a one 
in the i -th diagonal position and zeros elsewhere. Inserting these 
derivatives in (l2) and simplifying one finds that the information matrix 
relative to the parameters and is given by ohe pleasingly simple 

formulas: 

^ ir^ js' ^ 'rs ^ 'is ^ 'jr 

where 1 < i, j < p and 3. < r, s < k . Here ( • denotes the element 

in row r and column s of the matrix inside the parentheses and a^"^ 
denotes (Z*"^). . • 

Using (15) the constraint functions associated v/ith canonical 
rotation are: 



S^yi^,-^) = , 1 < u < V < k . 



(15) 



O These have derivatives: 

ERIC 
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dg 

= (5 A. 4-5' a! 

OA. ^ ru IV rv lu^ i 



ir 

of. iu iv i 

1 

for l<i<P, l:lr<k, and l<u<v<k. Here 8^^ denotes the 
Kronecker delta. 

From the general result (lO) the asymptotic covariance matrix for the 

maximum likelihood estimators of the A. and t. constrained by canoni- 

ir J 

cal rotation is the square matrix of order p(k + i) in the upper left- 
hand corner of the inverse of the augmented information matrix: 



5^ sf- V 



(17) 



To specify this matrix uniquely one must specify some order for the param- 
eters and constraints. For example, he may use 

^ir ' l<i<P > l<r<k (;l8) 
ordered lexicographically on i and r followed by 

^ 1 < J < P (19) 
in natural order and finally by 



-8- 

> ^<^<^<^ (20) 

in lexicographic order on u and v . 

Table 1 gives maximum likelihood estimates for a matrix of canonically 
rotated factor loadings obtained by Lawley & Maxwell (1971, p. k3). 

Insert Table 1 about here 



Table 2 gives the corresponding asymptotic standard errors computed by 
Lawley & Maxwell (1971^ 65) assuming a sample size n = 211 • 



Insert Table 2 about here 



Table 5 contains the same standard errors using the formulas derived here. 



Insert Table 5 about here 



Differences between the results in Tables 2 and 3 may be traced to a 
slight problem in Lawley' s formulas (j^nnrich & Thayer, 1975). In 
terms of the presentation by La^^rley & Ma>n'/ell (l97l) this may be cor- 
rected by inserting a 0^ immediately in front of the summation sign in 
equation (5»27)» V/hen this is done the values obtained agree to v/ithin 
one digit in the last decimal place with the results presented in Table 
5* 



5* NATURAL LOADINGS WITH ANALYTIC ROTATION 

in maximum likelihood factor analysis canonically rotated loadings 
are commonly referred to as unrotated loadings because these are the 
initial loadings produced by most estimation algorithras. Usually one is 
interested in other rotations. In the orthogonal case these include pre- 
' dominant ly quart imax^ varimax^ and equamax all of which are members 
of the orthomax family (Harman^ 19^0^ P« 55^) • As in canonical rotation, 
the loadings obtained from analytic rotation satisfy constraints which 
are associated with the form of rotation used* Archer & Jennrich (1975) 
have given a general method to generate constraints from arbitrary 
orthogonal rotation criteria. For the quartimx family they give 
constraint functions g^^ whose derivatives with respect to the loadings 
are given by: ' 

^UV c e /X 

-rr — = o a. - 6 a. (21) 
dA. ur luv vr ivu ^ ^ 

ir 

for l<i<P, l<r<k, and 1 < u < v < k where ' 

a. = ;\? - 37^^ -K, - 21 ij,^ ((A»A) - (A'A) ) - 2\, (A'A) ] 
luv IV lu IV p iv^^ ^ 'uu' iu^ 'uv 

for 1 < i < p and 1 < u^v < k . Here y = 0,1, k/2 corresponds to 

quartimax, varimax, and equamax rotation respectively. In the case of 

quartimax rotation and in fact in the case of analytic rotation generally, 

the constraints do not involve the unique variances . Thus 
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^^=0 , l<i<P > l<a<v<m 
i 



(25) 



Since only the constraint functions have been changed, the formulas for 
the asymptotic standard errors in the case of natural ' * .^ith ortho- 
max rotation are precisely the same as those in the previous section 
except that the constraint derivatives are now defined by equations (22) 
and (23) instead of (16) • The augmented information matrix here has the 
form: 



(24) 



0 / 



Table k gives a /arirax rotation of the loadings given in Table 1 
and Table 5 contains the corresponding standard errors obtained by using 



Insert Tables k and 5 about here 



the formulas of this section. These standard errors v/ere also computed 
using Lawley^s formulas modified as observed in the previous section 
together with the results of Archer & Jennrich (1975) • The values 
obtained agreed exactly (when rounded to three decimal places) with 
those presented in Table 5. 
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6. STANDAftDIZED LOADINGS WITH ANALYTIC 'ROTATION 

As observed earlier this is probably the most important case. We 

are interested here in parameterizing 2 in terms of the standardized 

loadings a. and the score standard deviations 0. as displayed in 
xr J " 

(?)• Since the maximum likelihood estimates of the a. are invariant 

xr 

under changes of scale in the score vector population, we may assume 
witliout loss of generality that each = 1 for the purpose of computing 
standard errors for the standardized loading estimates. Using (l2) and 
being careful to employ this assumption only after dii'ferentiation, the 
population information matrix is found to be given by: 



+ 2{p'^^fa. a. 

ir js 

4(a. ,a.) = p^-^a. 5..(p"-'-a). - 25. .p^^a. 



(25) 



(o.,a.) = 5. . + p. .p^' 



where 1 < i,^ <v , l<r, s<k, and p^ . and p denote components 
of the matrices P and P""^ respectively. These formulas are a little 
but not a great deal more complicated than the corresponding formulas 
(1^) in Section k. 
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In the case of orthomax rotation, the constraints are those of the 
previous section applied to A instead of A . It follows from (lO) 
tnat the asymptotic covariance matrix for the maximum likelihood estimates 
of the, a^^ and c (assuming the latter have true value one) is the 
•square matrix of order p(k l) in the upper left-hand corner of the 
inverse of the augmented information matrix: 



M(A,A) 



J(A,X) 

Ji(x,x) 



1^ 

0 



(26) 



Table 6 contains standard errors for the varimax rotated loadings- 



Insert Table 6 about here 



given in Table k using the results of this section to correct for 
standardization. We note in passing that v/hen corrected for standardization 
every loading except one has a smaller asymptotic standard error. 

As in the previous section, Lawley's modified formulas together with 
the results of Archer & Jennrich ^^^ere used to check the standard errors 
in Table 6. The results agreed to within one digit in the last decimal 
place presented. 

7- DISCUSSION 

We have seen that relatively simple formulas result when asymptotic 
standard errors for analytically rotated factor loadings are obtained by 
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inverting an augmented information matrix. From a practical point of vievj 
further simplification results from the fact that the formulas may be 
--^sily implemented as an independent computer module v/ith simple input. 
For example^ in the case of standardized loadings \i±th orthomax rotation 
the required input consists of the rotated standardized loadings matrix 
A , the orthomax parameter 7 ^ and the sample size n • Thus it is easy 
to modify an arbitrary maximum likelihood factor analysis program to 
produce asymptotic standard errors. 

As always^ however^ there are trade-offs. The information matrix 
method is not as computationally efficient as that based on the formulas 
of Lawley. It requires the inversion of a large matrix whose order is 
equal to the number of parameters plus the number of constraints. In the 
example there v;ere 27 loadings^ 9 unique variances (or score standard 
deviations)^ plus 5 constraints r.iaking a j<) by 59 matrix. Such a matrix, 
while large^ is about the same size as that v/hich is inverted in a linear 
regression problem with the same n'omber of parameters so the cost per 
standard error is no worse than in linear regression. This is perhaps 
only minor comfort since factor analysis typically involves considerably 
more parameters than regression analysis. It should be observed that 
v/ithin limits matrix inversion is not a terribly expensive operation. 
A medium speed computer such as the IBM 560/65 will invert the 59 by 59 
matrix in our example in v/ell under a second. One might also worry 
about precision problems resulting from the need to invert a large 
matrix, v/e know very little about the precision of this method compared 



'Ik' 



to that based on Lawley's results. We observe only that numerically 
accurate inversion is not difficult simply because a matrix is large (con- 
sider the identity matrix)^ nor is a method based on matrix inversion 
necessarily less accurate than one wliich involves primarily matrix 
multiplication. Fortunately it is not difficult to monitor the precision 
of a matrix inversion. While we have reason to believe the results in 
our examples are accurate to about 5 significant digits, the whole area 
of standard errors in factor analysis seems to be de\eloping too rapidly 
:o invest a great deal of effort in problems of numerical precision at 
this time. 

The results derived here apply to maximum likelihood factor analysis. 
Unlike the results of Archer Jennrich (1975) they are not easily 
adapted to other methods such as principal components analysis. The 
results here are also limited in that the oblique case has not been 
included. This is because in the oblique case with standardized loadings 
the information matrix becomes too messy to be included in a discourse 
v/hose title asserts simplification. 

Pfrhaps the most important use of the results derived here is that 
of verifying results derived elsev/here. They have already proved useful 
in uncovering a problem in Lawley's results and verifying those of Archer 
& Jennrich, 

The author is indebted to Allen Yates and Thomas Stroud for their 
detailed review and suggestions for improvement of this manuscript and 
to Karl JiJreskog for providing a much needed reference. 
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Table 1. --Canon ically Rotated Maximum Likelihood Loadings 







Factor 






iate 


I 


II 


III- 


Communality 


1 


.66I1 


.521 


.01k 


.550 


2 


.689 


.2U7 


-.195 


.575 


5 


.1)95 


.502 


-.222 


.585 


I1 


.857 


-.292 


-.055 


.7B8 


5 


.705 


-515 


-.155 


.619 


6 


.819 


-.577 


.105 


.825 


7 


.661 


.596 


-.078 


.600 


8 


.14 58 


.296 


.ii9l 


.558 


9 


.766 


.ii27 


-.012 


.769 
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Table 2. --Standard Errors for the Loadings in Table 1 Using 
Lawley^s (1967) Formulas 



Factor 

Variate I II III 



1 


.066 


.058 


.076 


2 




.061 


.068 


5 


.070 


.071 


.085 


k 


.060 


.0146 




5 


.065 


.057 


.072 


6 


.06l4 


.0146 


.057 


7 


.068 


.057 


.066 


8 


.075 


.095 


.1U2 


9 


.066 


.0147 


.055 
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Table 3* --Standard Errors for the Loadings in Table 1 Usin^ 
the Inverse of the Augmented Information Matrix 







Factor 




Variate 


I 


II 


III 


1 


.068 


.066 


.091 


2 


.066 


.076 


.069 


5 


.072 


.081 




k 


.061 


.068 


.05k 


5 


.067 


.075 


.081 


6 


.069 


.01)9 


.056 


7 


.071 


.068 


.080 


8 


.076 


.105 


.12l< 


9 


.070 


.058 


.078 
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Table if.— Varimax Rotation of the Loadings in Table 1 







ractoi' 






Variate 


I 


II 


III 


Communality 


1 


.605 


.506 


.500 




2 


.656 


.576 


.056 


.575 


5 


.589 


.190 


-.022 


.38h 


li 


.298 


.852 


.081 


•787 


5 


.2i|5 


.lk6 


-.065 


.620 


6 


.178 


.870 


.187 


.82lj 


7 


.709 


.258 


.175 


.600 


8 


.525 


.158 


.61*0 


.558 


9 


.772 


.519 


.269 


.769 
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Table 5* --Standard Errors for the Loadings in Table h 
Ignoring Standardization 







Factor 




•iate 


I 


II 


III 


1 


.085 


.059 


.115 


2 


.067 


.060 


.128 


3 


.071 


.065 


.150 


k 




.060 


.OkJ 


5 


.052 


.065 


.095 


6 


.0I18 


.061 


.0!48 


7 


.072 


.055 


.120 


8 


.152 


.058 


.200 


9 


.081 


.052 


.121 



Tab\e 6> --Standard Errors for the Loadings in Table k 
Corrected for Standardization 







Factor 




Variate 


I 


TI 


III 


1 


.066 


.052 


.105 


2 


.01+8 


.052 


.120 


5 


.056 


.060 


.12ll 


li 


.0I4I 


.029 


.070 


5 


.Oil 9 


.057 


.092 


6 


.058 


.052 


.Okk 


7 


.0I48 


.050 


.109 


8 


.12I4 


.055 


.190 


9 


.055 


.Ol|l| 


.109 



